The Condensed Nearest Neighbor Rule
Author
Abstract
This correspondence pertains to the nearest neighbor decision rule (NN rule). We briefly review the NN rule and then describe the condensed nearest neighbor rule (CNN rule). The NN rule assigns an unclassified sample to the same class as the nearest of n stored, correctly classified samples. In other words, given a collection of n reference points, each classified by some external source, a new point is assigned to the same class as its nearest neighbor. The most interesting theoretical property of the NN rule is that, under very mild regularity assumptions on the underlying statistics, for any metric, and for a variety of loss functions, the large-sample risk incurred is less than twice the Bayes risk. (The Bayes decision rule achieves minimum risk but requires complete knowledge of the underlying statistics.) From a practical point of view, however, the NN rule is not a prime candidate for many applications because of the storage requirements it imposes.

The CNN rule is suggested as a rule which retains the basic approach of the NN rule without imposing such stringent storage requirements. Before describing the CNN rule we first define the notion of a consistent subset of a sample set. This is a subset which, when used as a stored reference set for the NN rule, correctly classifies all of the remaining points in the sample set. A minimal consistent subset is a consistent subset with a minimum number of elements. Every set has a consistent subset, since every set is trivially a consistent subset of itself. Obviously, every finite set has a minimal consistent subset, although the minimum size is not, in general, achieved uniquely.

The CNN rule uses the following algorithm to determine a consistent subset of the original sample set. In general, however, the algorithm will not find a minimal consistent subset. We assume that the original sample set is arranged in some order; then we set up bins called STORE and GRABBAG and proceed as follows.

1) The first sample is placed in STORE.
2) The second sample is classified by the NN rule, using as a reference set the current contents of STORE. (Since STORE has only one point, the classification is trivial at this stage.) If the second sample is classified correctly it is placed in GRABBAG; otherwise it is placed in STORE.

3) Proceeding inductively, the ith sample is classified by the current contents of …
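The pass described in steps 1)–3) can be sketched as follows. This is an illustrative implementation, not the paper's own code: the function names (`nn_classify`, `condense`) and the Euclidean metric are my assumptions, and only the single initial pass quoted above is shown (the full CNN procedure continues cycling through GRABBAG until no more transfers to STORE occur).

```python
import math

def nn_classify(point, reference):
    """Classify `point` by the label of its nearest neighbor in `reference`,
    a non-empty list of (features, label) pairs, under Euclidean distance.
    (Illustrative; the CNN rule works with any metric.)"""
    best_label, best_dist = None, math.inf
    for feats, label in reference:
        d = math.dist(point, feats)
        if d < best_dist:
            best_dist, best_label = d, label
    return best_label

def condense(samples):
    """One pass of the CNN condensing step over an ordered sample set.

    The first sample goes into STORE; each subsequent sample is classified
    by the NN rule against the current contents of STORE, then placed in
    GRABBAG if classified correctly and in STORE otherwise.
    """
    store = [samples[0]]
    grabbag = []
    for sample in samples[1:]:
        feats, label = sample
        if nn_classify(feats, store) == label:
            grabbag.append(sample)
        else:
            store.append(sample)
    return store, grabbag
```

On well-separated data this pass already discards most points: with samples ordered as two class-'a' points near 0 and two class-'b' points near 10, STORE ends up holding just one representative of each class.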
Similar resources
An upper bound on prototype set size for condensed nearest neighbor
The condensed nearest neighbor (CNN) algorithm is a heuristic for reducing the number of prototypical points stored by a nearest neighbor classifier, while keeping the classification rule given by the reduced prototypical set consistent with the full set. I present an upper bound on the number of prototypical points accumulated by CNN. The bound originates in a bound on the number of times the ...
Edge Detection Based On Nearest Neighbor Linear Cellular Automata Rules and Fuzzy Rule Based System
Edge detection is an important task for sharpening the boundaries of images to detect regions of interest. This paper applies linear cellular automata rules and a Mamdani fuzzy inference model for edge detection in both monochromatic and RGB images. In the uniform cellular automata, a transition matrix has been developed for edge detection. The results have been compared to the ...
A Cluster-Based Merging Strategy for Nearest Prototype Classifiers
A generalized prototype-based learning scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed nearest neighbor classification rule by replacing a group of prototypes by a representative while approximately keeping their original classification power. The algorithm improves and generalizes previous works by explicitly introducing the concept of cluster and...
Publication date: 1967